Using JSON Columns in MySQL 8.0: When to Use and How They Are Stored/Indexed
MySQL 8.0 supports a native JSON data type that is validated and stored in a binary optimized format. It is useful in cases where schema flexibility is required or when semi-structured data needs to be stored without rigid normalization.
When storing semi-structured or flexible data where fields vary between rows (e.g., dynamic attributes, settings, metadata).
When application-layer objects map directly to JSON, reducing transformation overhead.
When you need rapid schema evolution without altering table structures.
When storing large, nested objects that would otherwise require multiple join-heavy relational tables.
When performance favors fewer joins over strict normalization, especially in read-heavy workloads.
When strong relational integrity, foreign keys, or normalization are required.
When you frequently filter or aggregate on deeply nested fields (relational structures perform better).
When large JSON fields lead to unnecessary data transfer and slower updates.
JSON is stored in a binary format (not plain text) optimized for lookup.
The structure includes metadata such as element types and offsets for fast access.
MySQL validates JSON on insert/update — invalid JSON is rejected.
Large JSON values may be stored off-page by InnoDB, similar to TEXT/BLOB.
Binary storage allows MySQL to parse only required portions instead of scanning entire JSON strings.
MySQL cannot directly index an entire JSON document.
Instead, you index generated (virtual or stored) columns extracted from JSON paths.
These generated columns can be indexed normally (BTREE).
MySQL 8.0 also supports functional indexes, allowing indexing of expressions like JSON_EXTRACT(data, '$.user.id').
Only the generated/indexed part is searchable using the index; the rest of the JSON remains unindexed.
No full-document JSON indexing exists in MySQL (unlike PostgreSQL’s GIN indexes).
This allows MySQL to index and query details->'$.customer.id' efficiently while keeping the flexibility of JSON.
JSON values consume more storage than normalized columns due to metadata overhead.
Updates to JSON columns rewrite the entire JSON document, not just changed parts.
Querying non-indexed JSON paths is slower because MySQL must scan the JSON document.
Indexes on generated columns greatly improve performance but require planning.